---
title: "HR Analytics"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    theme: flatly
---
    
```{r setup, include=FALSE}
library(flexdashboard)
```

### **HR Analytics Case Study**

**A large company, XYZ, employs around 4000 people at any given point in time. However, around 15% of its employees leave the company every year. Because management considers this attrition level too high, it wants to use predictive modelling to bring it down.**

**Hence, the objectives of the analysis are to:**

- *Help company XYZ identify current employees who are very likely to leave*
- *Recommend ways for company XYZ to decrease its attrition level in the future*

**The analysis is divided into three parts:**

- **Data understanding: source of the data and patterns in the data**
- **Predictive modelling of attrition**
- **Recommending ways for company XYZ to decrease its level of attrition**
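
The first objective, identifying employees who are likely to leave, amounts to scoring each employee with a predicted probability of attrition. The model itself is not shown in this dashboard, so the following is only a minimal sketch, assuming a logistic regression fitted with `glm()`. The data frame `sim`, its simulated values and the `cutoff` of 0.3 are hypothetical and are not taken from `final_data.csv`.

```r
set.seed(1)
n <- 300

# Simulated employees (illustrative only): younger employees are made more likely to leave.
sim <- data.frame(Age = sample(22:58, n, replace = TRUE))
sim$Attrition <- rbinom(n, size = 1, prob = plogis(2 - 0.08 * sim$Age))

# Logistic regression of attrition (1 = left, 0 = stayed) on age.
fit <- glm(Attrition ~ Age, data = sim, family = binomial)

# Predicted probability of leaving for every employee.
sim$p_leave <- predict(fit, newdata = sim, type = "response")

# Flag employees whose predicted probability reaches a hypothetical cutoff.
cutoff <- 0.3
flagged <- sim[sim$p_leave >= cutoff, ]
print(nrow(flagged))
```

In practice the cutoff would be chosen by weighing the cost of missing a likely leaver against the cost of flagging someone who would have stayed.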


### **Coefficients of the variables Age and TotalWorkingYears are significant.**

```{r}
library(plotly)
# Load the prepared data set used by every chart in this dashboard.
final_data <- read.csv("final_data.csv")
# Box plots of age and total working experience, split by attrition status.
plot1 <- plot_ly(final_data, x = ~Attrition, y = ~Age, color = ~Attrition, type = "box") %>%
  layout(xaxis = list(title = "Attrition"), yaxis = list(title = "Age"))
plot2 <- plot_ly(final_data, x = ~Attrition, y = ~TotalWorkingYears, color = ~Attrition, type = "box") %>%
  layout(xaxis = list(title = "Attrition"), yaxis = list(title = "TotalWorkingYears"))
group1 <- subplot(plot1, plot2, titleX = TRUE, titleY = TRUE, margin = 0.05)
group1
```


***

**Age**

- Employees aged 36 years and above are more likely to stay

- Employees aged 32 years and below are more likely to leave

**Experience**

- Employees who have worked for a total of 10 years or more are more likely to stay

- Employees who have worked for a total of 7 years or less are more likely to leave

*Among employees who left, median age = 32 and median experience = 7 years*

*Among employees who stayed, median age = 36 and median experience = 10 years*
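
Group medians like the ones quoted above can be computed with a grouped summary. A minimal sketch, using a small made-up data frame: `toy` and its values are invented for illustration and are not rows from `final_data.csv`; only the column names follow the chart code.

```r
# Toy data standing in for final_data; the values are invented for illustration.
toy <- data.frame(
  Attrition = c("Yes", "Yes", "Yes", "No", "No", "No"),
  Age = c(25, 30, 41, 34, 38, 50),
  TotalWorkingYears = c(3, 6, 12, 9, 11, 22)
)

# Median of each numeric column within each Attrition group.
group_medians <- aggregate(cbind(Age, TotalWorkingYears) ~ Attrition,
                           data = toy, FUN = median)
print(group_medians)
```

Replacing `toy` with `final_data` would give the medians for the real data, assuming the columns are named as in the chart code.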



### **Training and Years with Current Manager**

```{r}
# Box plots of training sessions last year and years with the current manager, split by attrition status.
plot3 <- plot_ly(final_data, x = ~Attrition, y = ~TrainingTimesLastYear, color = ~Attrition, type = "box") %>%
  layout(xaxis = list(title = "Attrition"), yaxis = list(title = "TrainingTimesLastYear"))
plot4 <- plot_ly(final_data, x = ~Attrition, y = ~YearsWithCurrManager, color = ~Attrition, type = "box") %>%
  layout(xaxis = list(title = "Attrition"), yaxis = list(title = "YearsWithCurrManager"))
group4 <- subplot(plot3, plot4, titleX = TRUE, titleY = TRUE, margin = 0.05)
group4
```

***

**Training**

- Employees who received 3 or more training sessions last year are more likely to stay

- Employees who received 2 or fewer training sessions last year are more likely to leave

**Years with Current Manager**

- Employees who have spent 3 years or more under the same manager are more likely to stay

- Employees who have spent 2 years or less under the same manager are more likely to leave

- *Coefficients of the variables TrainingTimesLastYear and YearsWithCurrManager are significant.*

- *The remaining figures are based on group means and medians.*
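
Coefficient significance refers to a fitted attrition model, which this dashboard does not show. As a rough sketch of how such a check might look, assuming a logistic regression fitted with `glm()`: the data frame `sim` and the effect sizes used to simulate it are hypothetical and not derived from `final_data.csv`.

```r
set.seed(42)
n <- 500

# Simulated predictors (illustrative only).
sim <- data.frame(
  TrainingTimesLastYear = sample(0:6, n, replace = TRUE),
  YearsWithCurrManager = sample(0:15, n, replace = TRUE)
)

# Simulate attrition so that more training and more years with the same
# manager lower the odds of leaving.
log_odds <- 0.5 - 0.3 * sim$TrainingTimesLastYear - 0.2 * sim$YearsWithCurrManager
sim$Attrition <- rbinom(n, size = 1, prob = plogis(log_odds))

# Logistic regression of attrition on both predictors.
fit <- glm(Attrition ~ TrainingTimesLastYear + YearsWithCurrManager,
           data = sim, family = binomial)

# Coefficient table: estimate, standard error, z value and p-value per term.
coef_table <- coef(summary(fit))
print(coef_table)
```

A coefficient is usually called significant when its p-value, in the `Pr(>|z|)` column, falls below a chosen level such as 0.05.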


### **Job Satisfaction and Environment Satisfaction**

```{r}
library(ggplot2)
library(plotly)
# Bars scaled to 1, so each bar shows the share of attrition within a satisfaction level.
g1 <- ggplot(final_data, aes(x = EnvironmentSatisfaction, fill = Attrition)) +
  geom_bar(position = "fill") + labs(y = "Proportion")
g2 <- ggplot(final_data, aes(x = JobSatisfaction, fill = Attrition)) +
  geom_bar(position = "fill") + labs(y = "Proportion")
# subplot() accepts ggplot2 objects and converts them to plotly, so no extra ggplotly() call is needed.
grid1 <- subplot(g1, g2, titleX = TRUE, titleY = TRUE, margin = 0.05)
grid1
```

***

**Job Satisfaction**

- Employees with medium, high or very high levels of job satisfaction are more likely to stay

- Employees with low levels of job satisfaction are more likely to leave

**Environment Satisfaction**

- Employees with medium, high or very high levels of environment satisfaction are more likely to stay

- Employees with low levels of environment satisfaction are more likely to leave

- *Coefficients of the variables JobSatisfaction and EnvironmentSatisfaction are significant.*

- *Employees were asked to report their job satisfaction and work environment satisfaction levels in a survey.*
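
The bars drawn with `geom_bar(position = "fill")` show, for each satisfaction level, the share of employees who left and who stayed. The same shares can be tabulated directly. A minimal sketch on a made-up data frame (`toy` and its values are illustrative assumptions, not survey results):

```r
# Toy data standing in for final_data; the values are invented for illustration.
toy <- data.frame(
  JobSatisfaction = c("Low", "Low", "Low", "Low", "High", "High", "High", "High"),
  Attrition = c("Yes", "Yes", "No", "No", "Yes", "No", "No", "No")
)

# Cross-tabulate satisfaction level (rows) against attrition (columns).
counts <- table(toy$JobSatisfaction, toy$Attrition)

# margin = 1 normalises each row, giving shares within a satisfaction level.
shares <- prop.table(counts, margin = 1)
print(shares)
```

Each row of `shares` sums to 1, which matches the full-height bars in the chart.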



### **Average Work Hours and Work Life Balance**

```{r}
# Box plot of average work time and proportion bars of work life balance, split by attrition status.
g1 <- plot_ly(final_data, x = ~Attrition, y = ~AverageWorkTime, color = ~Attrition, type = "box") %>%
  layout(xaxis = list(title = "Attrition"), yaxis = list(title = "AverageWorkTime"))
g2 <- ggplot(final_data, aes(x = WorkLifeBalance, fill = Attrition)) +
  geom_bar(position = "fill") + labs(y = "Proportion")
grid1 <- subplot(g1, g2, titleX = TRUE, titleY = TRUE, margin = 0.05)
grid1
```


***

**Average Work Hours**

- Employees who work 7.3 hours or less on average are more likely to stay

- Employees who work 8.2 hours or more on average are more likely to leave

**Work Life Balance**

- Employees who rated their work life balance as good, better or best are more likely to stay

- Employees who rated their work life balance as bad are more likely to leave

- *Coefficients of the variables AverageWorkTime and WorkLifeBalance are significant.*

- *The average work hours figures are based on group means and medians.*

- *Employees were asked to report their level of work life balance in a survey.*



### **Monthly Income and Percent Salary Hike**

Monthly Income and Percent Salary Hike do not appear to affect attrition: the coefficients of these variables are not significant.


```{r}
# Box plots of monthly income and percent salary hike, split by attrition status.
plot5 <- plot_ly(final_data, x = ~Attrition, y = ~MonthlyIncome, color = ~Attrition, type = "box") %>%
  layout(xaxis = list(title = "Attrition"), yaxis = list(title = "MonthlyIncome"))
plot6 <- plot_ly(final_data, x = ~Attrition, y = ~PercentSalaryHike, color = ~Attrition, type = "box") %>%
  layout(xaxis = list(title = "Attrition"), yaxis = list(title = "PercentSalaryHike"))
pp <- subplot(plot5, plot6, titleX = TRUE, titleY = TRUE, margin = 0.05) %>%
  layout(title = "Compare Monthly Income and Salary Hike with Attrition")
pp
```